<% component = metadata.transforms.tokenizer %>

<%= component_header(component) %>

## Config File

<%= component_config_example(component) %>

## Options

<%= options_table(component.options.to_h.values.sort) %>

## Examples

Given the following log line:

{% code-tabs %}
{% code-tabs-item title="log" %}
```json
{
  "message": "5.86.210.12 - zieme4647 [19/06/2019:17:20:49 -0400] "GET /embrace/supply-chains/dynamic/vertical" 201 20574"
}
```
{% endcode-tabs-item %}
{% endcode-tabs %}

And the following configuration:

{% code-tabs %}
{% code-tabs-item title="vector.toml" %}
```coffeescript
[transforms.<transform-id>]
type = "tokenizer"
field = "message"
fields = ["remote_addr", "ident", "user_id", "timestamp", "message", "status", "bytes"]
```
{% endcode-tabs-item %}
{% endcode-tabs %}

A [`log` event][docs.log_event] will be emitted with the following structure:

```javascript
{
  // ... existing fields
  "remote_addr": "5.86.210.12",
  "user_id": "zieme4647",
  "timestamp": "19/06/2019:17:20:49 -0400",
  "message": "GET /embrace/supply-chains/dynamic/vertical",
  "status": "201",
  "bytes": "20574"
}
```

A few things to note about the output:

1. The `message` field was overwritten.
2. The `ident` field was dropped since it contained a `"-"` value.
3. All values are strings, we have plans to add type coercion.
4. [Special wrapper characters](#special-characters) were dropped, such as
   wrapping `[...]` and `"..."` characters.


## How It Works [[sort]]

<%= component_sections(component) %>

### Blank Values

Both `" "` and `"-"` are considered blank values and their mapped field will
be set to `null`.

### Special Characters

In order to extract raw values and remove wrapping characters, we must treat
certain characters as special. These characters will be discarded:

* `"..."` - Quotes are used tp wrap phrases. Spaces are preserved, but the wrapping quotes will be discarded.
* `[...]` - Brackets are used to wrap phrases. Spaces are preserved, but the wrapping brackets will be discarded.
* `\` - Can be used to escape the above characters, Vector will treat them as literal.

## Troubleshooting

<%= component_troubleshooting(component) %>

## Resources

<%= component_resources(component) %>